Writing as
a Linguistic Problem

Deborah McCutchen 
University of Pittsburgh 

This article presents a psycholinguistic view of writing that focuses on the 
processes of translating concepts into sentences. The current research em- 
phasis on planning is discussed in terms of its theoretical roots in artificial 
intelligence models of planning and the limited applicability of such 
models for writing. A case is made for viewing writing not only as a planning
problem but as a linguistic problem that can benefit from work in
reading and speech production. 



As writing has become an area of cognitive 
inquiry, the related research has acquired a 
peculiar character. Some of the best known 
cognitive work compares the composing pro- 
cess with problem solving (Collins & Gentner, 
1980; Hayes & Flower, 1980; Nold, 1981), and
within the problem solving framework, writing 
has become yet another domain in which the 
importance of high level planning can be 
demonstrated. Thus, writing is clustered with 
physics and other problem solving domains 
and separated from linguistic processes such as 
speech production and reading to which it in- 
tuitively seems related. Much of the cognitive 
work on writing has focused on high level 
planning and abstract goals (e.g., Burtis, 
Bereiter, Scardamalia, & Tetroe, in press; 
Flower & Hayes, 1980, 1981a; Matsuhashi,
1982; Scardamalia & Bereiter, 1982), to the
extent that problem solving heuristics are ad- 
vocated to students as ways to improve their 
writing (Flower, 1981). 

This article will argue that the current focus 
on high level planning and abstract goals runs
the risk of misrepresenting the contributions 
that cognitive psychology could make to the 
study of writing if it neglects important 
linguistic features that distinguish the writing 
of natural language from other problem solv- 
ing tasks. Planning — as it is frequently 
discussed — seems to end just where much of 
the real problem of writing begins, and little 
attention is given to the on-line processes
involved in linear sentence generation. Left
unspecified are the processes that make writing
a unique problem: the generation of extended,
coherent language.

Preparation of this manuscript and the research
presented here was supported by the Learning Research
and Development Center, which is supported in part by
the National Institute of Education. The author also
wishes to thank Charles A. Perfetti for discussions of the
issues presented here and his helpful comments on an
earlier version of this manuscript.

The address of Deborah McCutchen is: Learning Re*



Contributions and Limitations 
of Current Approaches 

The problem solving framework has enabled 
important insights into writing as a process, and 
perhaps the most important of those insights 
have concerned the interactive nature of the 
writing process (Hayes & Flower, 1980). So 
much recent empirical work, however, has 
centered on planning that other processes which 
interact with planning have been neglected. 
This is not to say that these other processes are 
totally ignored by researchers. In much of the 
work cited earlier and in other cognitive work 
(e.g., E.J. Bartlett, 1982; Beaugrande, 1982; 
Bereiter & Scardamalia, 1981; Flower & Hayes, 
1984; Shuy, 1981), there are numerous 
acknowledgements of the importance of 
language-based processes in writing. Flower 
(1981) allots eight pages of her tutorial on 
writing to such linguistic concerns. However, 
the problem solving perspective and its empha- 
sis on planning are so dominant in the popular 
perception — formed by treatments such as 
Flower's (1981) — that little attention is given 
to the varied ways in which cognitive science can 
inform the study of writing. Planning is certainly
important in writing, but a well-planned text
is not necessarily a well-written one. 

In what follows, an alternative perspective 
on writing is developed — one that em- 
phasizes psycholinguistic processes involved in 
generating sentences and linking them into 
coherent text. However, before that perspec- 
tive is presented in detail, the potential 
limitations of too narrow a focus on planning
need to be understood. We begin by tracing
the theoretical roots of the problem solving ap- 
proach, in order to explain its somewhat 
natural emphasis on planning, and by examin- 
ing the applicability of planning models to 
writing. 



Problem Solving Approaches 
Theoretical roots 

The problem solving models of writing have 
developed out of work in cognitive science. 
These and other studies of human complex
problem-solving behavior (e.g., Hayes-Roth &
Hayes-Roth, 1979; Jeffries, Turner, Polson, &
Atwood, 1981; Larkin, McDermott, Simon, &
Simon, 1980; Voss, Greene, Post, & Penner,
1983) have examined aspects of human
behavior for which quantifiable specifications
previously had been few. They generally have 
followed the example of artificial intelligence 
(AI) models of problem solving, which em- 
phasize problem decomposition, with plan- 
ning as an especially important subprocess. 
(See Cohen & Feigenbaum, 1982, for a detailed
summary.) The importance of planning in
most complex tasks is well recognized, and its 
central role in writing is emphasized not only 
in the work of cognitive researchers but in 
traditional rhetoric texts as well (e.g., Skwire,
Chitwood, Ackley, & Fredman, 1975).
However, when the nature of the planning in- 
volved in writing is compared with the plan- 
ning done by AI models, some interesting dif- 
ferences emerge. 

How well does writing fit typical planning 
models? To answer this question, let us ex- 
amine first some AI models in which the 
underlying process assumptions are well laid 
out. Then we will examine how well these AI 
models describe human problem solving in 
various domains, and finally how well they 
describe writing. 

AI planners. The most well-specified models 
are those planning systems developed in the 
area of AI, and these planners differ according 
to the number of levels of abstraction permit- 
ted in the problem representation. That is, 
how much of the problem solving is done in 
the abstract, before local details are specified? 
In this respect, there are important implica- 
tions for models of human problem solving, 



and for writing especially. While AI models 
are not typically intended to be simulations of 
human processing, their feasibility as such is 
interesting to examine because, ultimately,
any model of the writing process should be as 
well specified as these AI models. 

AI planners conventionally described as
nonhierarchical (e.g., STRIPS, HACKER, INTERPLAN;
see Cohen & Feigenbaum, 1982)
represent the problem at a single level and
thus do no abstract planning. They begin solv- 
ing the initial subgoal and continue working 
linearly, on the assumption that early decisions 
are independent of later ones. Because of their 
lack of foresight, they are generally less power- 
ful problem solvers. In these systems, critical 
steps in the solution process are not 
distinguished from trivial ones, and a considerable
amount of work can be done (i.e.,
many subgoals created and satisfied) before 
the planner confronts a critical goal that cannot
be achieved because of a trivial early decision.
This requires a good deal of extra processing:
backtracking to a critical choice point, undoing,
and then redoing the solution process.
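
The cost of linear commitment can be made concrete with a small sketch. The Python fragment below is an illustration under stated assumptions only: it is not STRIPS, HACKER, or INTERPLAN, and the subgoal names, the options, and the `consistent` check are hypothetical. It commits to the first available option for each subgoal in order and counts how many commitments must be undone when a late goal fails because of an early choice.

```python
def linear_plan(subgoals, options, consistent):
    """Commit to options left to right; undo commitments only after a failure.

    Returns (plan or None, number of commitments that had to be undone).
    """
    choices = []
    undone = 0

    def solve(i):
        nonlocal undone
        if i == len(subgoals):
            return True
        for option in options[subgoals[i]]:
            choices.append(option)          # commit with no look-ahead
            if consistent(choices) and solve(i + 1):
                return True
            choices.pop()                   # backtrack: undo this commitment
            undone += 1
        return False

    found = solve(0)
    return (list(choices) if found else None), undone


# Hypothetical toy problem: the conflict is visible only at the last subgoal.
subgoals = ["opening", "middle", "closing"]
options = {
    "opening": ["formal opening", "casual opening"],
    "middle": ["data", "anecdote"],
    "closing": ["closing joke"],
}


def consistent(choices):
    # Assumed constraint: a formal opening clashes with a closing joke.
    return not (
        len(choices) == 3
        and choices[0] == "formal opening"
        and choices[2] == "closing joke"
    )


plan, undone = linear_plan(subgoals, options, consistent)
print(plan, undone)
```

In this toy case the first choice for the opening is the trivial early decision; every option for the middle is tried and discarded before the planner returns to it.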

Other AI planning systems described as 
hierarchical planners (e.g., NOAH,
MOLGEN; see Cohen & Feigenbaum, 1982;
Stefik, 1981a, 1981b) avoid backtracking by 
working at multiple levels of abstraction in 
their initial plans. Planners of this type follow 
the least commitment principle, postponing 
decisions about details until a proposed 
abstract solution is shown not to result in in- 
terference among important decisions. Poten- 
tial conflicts in the plan can thus be detected 
early and corrected before the costly work of 
solving subproblems in detail is done. These 
procedures, together with extensive knowledge 
of the specific problem domain (see especially 
MOLGEN, by Stefik, 1981a, 1981b), make 
hierarchical planners very powerful and thus 
quite popular in AI research. 
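
The least commitment idea can be sketched in the same toy terms. The fragment below is a plain constraint-propagation loop, chosen here as an assumption for illustration; it is not the mechanism of NOAH or MOLGEN, and the variable names and the constraint are hypothetical. Each decision is held open as a set of candidates, and candidates that conflict with another decision are discarded before anything is committed.

```python
def propagate(domains, constraints):
    """Narrow each open decision without committing to any value.

    domains: dict mapping a variable to its set of candidate values.
    constraints: list of (var_a, var_b, allowed), with allowed(a, b) -> bool.
    Returns narrowed copies of the candidate sets.
    """
    open_sets = {var: set(values) for var, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for a, b, allowed in constraints:
            # Keep a candidate only if some candidate of the partner supports it.
            keep_a = {x for x in open_sets[a]
                      if any(allowed(x, y) for y in open_sets[b])}
            keep_b = {y for y in open_sets[b]
                      if any(allowed(x, y) for x in keep_a)}
            if keep_a != open_sets[a] or keep_b != open_sets[b]:
                open_sets[a], open_sets[b] = keep_a, keep_b
                changed = True
    return open_sets


# Hypothetical decisions, all left unspecified at the start.
domains = {
    "opening": {"formal", "casual"},
    "middle": {"data", "anecdote"},
    "closing": {"joke"},
}
constraints = [
    # Assumed constraint: a formal opening clashes with a closing joke.
    ("opening", "closing", lambda o, c: not (o == "formal" and c == "joke")),
]

narrowed = propagate(domains, constraints)
print(narrowed)
```

The conflict is detected before any detailed work is done, and the unconstrained decision stays open. Propagation of this kind narrows the choices but does not by itself guarantee a complete solution.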

Human problem solving. Only a few ex- 
amples of human problem solving, however, 
seem as hierarchical in nature as the more 
powerful AI planners. Experts in software
design (Jeffries et al., 1981) and physics
(Larkin et al., 1980) seem to decompose com- 
plex problems into classifiable problem types 
with recognizable solutions (recognizable at 
least to these experts). Novices, however, lack- 
ing the rich knowledge base of the experts, are 
typically less successful in decomposing and 
classifying the problems and seem to be forced 
into nonhierarchical planning, forced simply
to begin wherever they can and see how their 
solutions work out. 

In other problem domains, even expert prob- 
lem solving looks less hierarchical. Voss et al. 
(1983) found that the solutions generated by 
their social science experts did not always 
emerge neatly out of hierarchical refinements of 
abstract representations. Often subgoals were 
unexpectedly encountered and solved during 
the evaluation of implications of another 
subgoal. Such multidirectional, "oppor- 
tunistic" problem solving characterizes other 
human problem solving performance (Hayes-Roth
& Hayes-Roth, 1979). In these descriptions,
human solutions evolve incrementally
from subgoals at various levels rather than from 
orderly problem refinement at any given level, 
with low-level constraints sometimes being 
dealt with before more abstract ones. 



Applications to Writing 

Planning in writing. What is the nature of 
the planning done by writers? Many tradi- 
tional composition texts present writing as a 
task directly amenable to hierarchical planning,
with multiple levels of problem decomposition
(e.g., Skwire et al., 1975). This is
reflected in the frequent suggestion to the stu- 
dent to first create an outline of the composi- 
tion. A topic is to be chosen, and the paper 
divided into introduction, body, and conclu- 
sion. The body of the paper is further sub- 
divided into paragraphs of thesis support, each 
making a main point stated in the paragraph's 
topic sentence and supported in its body. All 
that then remains is the "fleshing out" of the 
outline. (See also Emig, 1971, for a related 
discussion of rhetoric texts.) 

There are limits, however, to the appropriateness
of hierarchical models for
writing. Recall that hierarchical planners 
operate according to the least commitment 
principle, keeping variables unspecified for as
long as possible. Only so much planning of a
composition, however, can be done in the 
abstract, even by skilled writers. Relatively ear- 
ly the writer is forced to define variables (i.e., 
to actually write a sentence or a few words), 
and this often occurs before every paragraph is 
fully planned and waiting to be "dressed" in 
the appropriate words. Words already written 
can drastically affect what follows them; in 
fact, they must if smooth transitions and local 



coherence are to be maintained (Halliday &
Hasan, 1976). With such early constraints on
variables, the writer loses the power of the
hierarchical planners. The writer is forced, at
some point in the actual generation of
sentences, to follow the linearity assumption 
typical of nonhierarchical planners, choosing to 
begin with something and following it, 
sometimes to a preplanned next idea, 
sometimes to a newly discovered thought, and 
sometimes to a dead end. The nonhierarchical 
aspects of writing are all too many, as data from 
Hayes and Flower's (1980; Flower & Hayes, 
1981b) protocol studies reveal. Low level editing 
frequently interrupts the planning and 
generating processes, and in most research on 
writers' actual composing behavior, emphasis is 
placed on the interactive nature (in the 
psychological sense of multiple information 
sources) of the subprocesses of writing (E. J.
Bartlett, 1982; Beaugrande, 1982; Burtis et al.,
in press; Emig, 1971; Hayes & Flower, 1980;
Matsuhashi, 1982; Nold, 1981; Shuy, 1981).

The interactive nature of writing. The interac- 
tion between text-level processes and planning- 
level processes is well illustrated in the following 
excerpts from the protocol of a writer of a 
newspaper wine column. This writer began the 
protocol with a well formed plan concerning au- 
dience and style, even specifying the structure of 
his column about a tasting of wines from
Chateau Latour: "The general structure has got 
to be, we've got to give them some information 
about Chateau Latour, make it kind of real to 
them, give them something to chew on, and 
then we're going to go through the tasting notes
. . . . " Even with this plan, however, text-level 
decisions had to be made, as his protocol shows. 

In Figure 1, the writer knows the content he 
wants to express, but it is the expression, in 
linearly structured sentences, that gives him 
trouble. (The section of text on which the 
writer is working is presented on the right side 
of the figure, and the writer's comments on 
the left.) 

The writer's plan had specified that he give 
some information about Latour, specifically 
that 80% of the Latour vineyards are planted
in cabernet grapes and that this is the source of
the wine's longevity. However, an appropriate 
sentence structure coordinating those two ideas 
does not just fall out of his semantic plan, and 
we see the writer try one alternative after 
another. Constructing appropriate sentences is
part of the writing task. 
PROTOCOL



That doesn't read too well at all, but it's the 
right idea. So why not break it up . . . 



Now we can get the 80% in, OK? (reads)
"80% of the grapes? Cabernet? 80%" I
want to say it's the vineyard that's cabernet,
but we just said it's the vineyard, and that
gets boring with too many "vineyards." . . .
So why not just make it a nice run-on
sentence . . . That way we don't have to
repeat.



TEXT 

. . . Probably one of the major reasons for 
the longevity of the wines lies in the 80% 
cabernet sauvignon grapes 



(edits) 
80% 



lies in the vineyards of Latour 



(edits) . . lies in the vineyards of Latour, 
80% of which are given over to the cabernet 
grape. 

Figure 1. Adult writer's protocol and text as the writer works out sentence syntax.



PROTOCOL 



Well now, how would one describe the 
grape? Wildness? See, what we have to do 
now is tell them why it is that cabernet 
sauvignon gives it the longevity. Why does 
it? Because it is a hard grape. It takes a long 
time to come around. OK . . . (reviews) 
"This is the grape that . . ." ah, "provides 
backbone . . ." 



TEXT 

... the vineyards of Latour, 80% of which 
are given over to the cabernet sauvignon 
grape, (types) This is the grape used in the 
hardest 



(edits) This is the grape that provides 
backbone . . 



Figure 2. Adult writer's protocol and text as the writer uses the text to refine conceptual plan. 



The protocol continues in Figure 2, and here 
we can see that the writer's linear generation 
processes have outrun his plan for content. He 
then uses the text he has written to better for- 
mulate his idea and help retrieve content. 

It is probably not a coincidence that the 
word "hard," appearing first in the written 
text, is used as a prompt in a memory search 
for better descriptors. Here the process of
writing connected sentences has led the writer
to a point where his high level plans were not 
well specified, and his written sentences ac- 
tually help achieve the goal by providing a 
prompt, a jumping-off point from which to 
begin some new semantic planning. In these 
two excerpts we see different types of text-level 
processing, and the interaction among them is 
striking. In Figure 1 the writer knew the 
concepts that he wanted to express and even
many of the specific lexical items; however, he
had not worked out the syntactic frames for
those lexical items. In the course of the protocol
we see the writer work out an appropriate syntax,
guided by constraints from the lexical level,
constraints against too much word repetition. 

In Figure 2 something slightly different occurs.
There is a general semantic plan for the
section (i.e., to explain the connection be- 
tween cabernet grapes and a wine's longevity), 
but the writer has neither the specific syntax, 
concepts, nor the lexical items to express them. 
As we see in the sentence fragment that ends 
the first segment of text, the writer seems
simply to begin the new sentence, building it
on the sentence just completed, and he goes as
far as he can with it. His first pass at the 
sentence is not in the form he will actually use, 
but that first expression gets him into the ap- 
propriate semantic field and enables further 
refinement of the semantic plan. 

In this second excerpt we see an example of
"text-based" writing used to its fullest advantage
by an expert writer, and it illustrates how
truly interactive writing can be. This writer 
began with a well-formed plan of the general 
structure of the column and of the audience 
who would read it. Even experts, however, 
cannot plan in advance every elaborative detail 
that might become appropriate as the text 
develops. As we see in Figure 2, the text itself 
can influence phrasing and can even prompt 
the writer to pursue an idea that was not
salient in the initial plan. 

Thus, to accurately describe the behavior of 
writers, planning systems such as those 
developed within AI must be altered substantially.
The hierarchical models that are powerful
enough to solve relatively complex problems do 
not fit very well the task of generating coherent 
texts. Even descriptions of human experts per- 
forming the superficially similar task of software 
design (Jeffries et al., 1981) seem quite dif- 
ferent from expert writing performance. The 
solution processes of software design experts 
seem somewhat hierarchical in nature, while 
those of writers are much more interactive. The 
differences are probably due to the symbolic 
codes required in the two tasks. Writing com- 
puter code is not like writing natural language 
because the syntax of computer code is fixed. 
Once the general semantics of a computer pro- 
gram have been worked out, translation into 
code may be a rather trivial problem, at least for 
programmers of reasonable skill. For the writer 



of natural language, however, syntax is not fixed.
There are a variety of lexical and syntactic
forms that can be used to express the same 
general semantic concepts (as our columnist 
demonstrated). Most important, those various 
syntactic forms render the semantic concepts 
no longer exactly equivalent. A passive construction,
for example, signals a different
sentence focus than does an active construc- 
tion. Because natural language permits 
nuances of meaning that computer codes do 
not, rearranging syntax can result in subtle 
changes in theme or foregrounding that can af- 
fect the reader's comprehension (Chafe, 1972; 
Halliday, 1967; Halliday & Hasan, 1976;
Lesgold, Roth, & Curtis, 1979). 

In writing, unlike some tasks, decisions at 
the most detailed level of word choice and 
sentence construction can have large effects on 
abstract goal outcomes such as tone, perspec- 
tive, and audience. This is because those goals 
are fully achieved only at the most local level. 
For example, our wine columnist proposed to 
continue his explanation of wine's longevity 
with a discussion of esters and aldehydes. 
Reconsidering his purpose and audience, he 
chose instead to refer to "smells and flavors," 
because he wanted only a brief reference to 
those concepts. The concepts themselves were 
not ruled out by his general plans, but the 
writer had to decide on which aspect of the 
concepts to focus (on their chemical basis or 
their perceptual qualities) and on the cor- 
responding lexical labels ("esters and 
aldehydes" or "smells and flavors"). Inappropriate
choices at that final level of specification
could have undermined his plans concerning
audience and purpose.



Language-Based Approaches 

Because of the distinctive features of 
natural language production, linguistic tasks 
may be much more like one another than they 
are like other problem solving tasks. Thus, 
work in reading and speech production may 
provide additional models for studying 
writing. In work on reading comprehension, 
two perspectives have emerged, and while they 
are often viewed as adversative, they are 
actually rather complementary. The "top- 
down" approach emphasizes the importance 
of the reader's knowledge and its schematic 
organization, while the "bottom-up" ap- 
proach emphasizes lower level linguistic 
processes.¹ The model emerging from reading
research reconciles these approaches by emphasizing
interactions among processes at all
levels. Thus, both top-down and bottom-up 
processes have been shown to contribute to
successful comprehension, and there may be 
corresponding contributions, as well as interac- 
tions, at each level in the case of writing. 



Top-down Processes

Top-down approaches to reading. Top-down
information plays an important part in reading, 
for example the role of schemata in guiding 
comprehension. A schema is a hypothesized 
knowledge structure that connects events or 
concepts in some organized arrangement. 
Popular examples of such schemata are the col- 
lections of events that typically make up a story 
(setting, characterization, complication, and 
resolution) or a visit to a restaurant (ordering, 
eating, paying, and leaving). 

The usefulness of the organizational proper- 
ties of schemata for comprehension and recall 
has been repeatedly demonstrated. Texts 
ordered according to typical narrative schemata 
are consistently better recalled than those with 
unusual orders (Stein & Nezworski, 1978;
Thorndyke, 1977), and recall of discourse that
does not follow the ideal schematized order
tends to be restructured more in accordance
with that order than the actual order of input
(Bower, Black, & Turner, 1979; Mandler, 1978;
Stein & Nezworski, 1978; Thorndyke, 1977).
Recall also suffers when appropriate schemata
are unavailable (F. C. Bartlett, 1932;
Bransford & Johnson, 1972), and recall is better
when events in stories are logically related
rather than loosely temporally ordered (Anderson,
Spiro, & Anderson, 1977, in Anderson,
1978; Black & Bern, 1981; Brown, 1976; Kintsch,
Mandel, & Kozminsky, 1977).
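
A schema of the kind described above can be pictured as an ordered set of slots. The sketch below is a toy data structure, not a model taken from any of the studies cited; the slot names follow the restaurant example in the text, while the function names and the sample events are hypothetical. It shows events presented out of order being mapped onto slots and then read back in the schema's canonical order, which is one simple way to picture the restructuring observed in recall.

```python
# Ordered slots for the restaurant example given in the text.
RESTAURANT = ["ordering", "eating", "paying", "leaving"]


def instantiate(schema, events):
    """Map (slot, description) events onto the schema; unfilled slots stay None."""
    slots = {slot: None for slot in schema}
    for slot, description in events:
        if slot in slots:
            slots[slot] = description
    return slots


def recall(schema, slots):
    """Read filled slots back in schema order, not in order of input."""
    return [slots[slot] for slot in schema if slots[slot] is not None]


# Hypothetical input, deliberately out of the canonical order.
heard = [
    ("paying", "left a tip"),
    ("ordering", "asked for soup"),
    ("eating", "ate quickly"),
]

recalled = recall(RESTAURANT, instantiate(RESTAURANT, heard))
print(recalled)
```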

In addition, a concept's probability of recall 
or inclusion in a summary increases when that 
concept is judged (by various indices) to be 
more important or more central to the schema 
(Brown & Smiley, 1977; Johnson, 1970;
Omanson, 1982; Rumelhart, 1975). Similarly, 
the schema instantiated during comprehension 
can have dramatic effects on which concepts 
are recalled and on their interpretation 
(Anderson, Reynolds, Schallert, & Goetz,
1977; Pichert & Anderson, 1977, in Anderson,
1978). Comprehension has, in fact, been 
defined by some as instantiating the 



appropriate schema and mapping the incom- 
ing information onto the various slots (Collins, 
Brown & Larkin, 1980; Rumelhart & Ortony, 
1977; Schank & Abelson, 1977). While others
argue that there is more to comprehension and
reading skill than top-down knowledge
(Perfetti, in press; Perfetti & Roth, 1981;
Stanovich, 1981), it is generally acknowledged
by researchers from all perspectives that 
schemata are quite useful in organizing new 
information, in relating it to the reader's 
general knowledge during reading, and in ac- 
cessing that information during recall. 

Schemata in composition: applications and
limitations. The role of schemata in memory
access and retrieval during reading suggests
that schemata may also be useful in writing. 
There is some suggestion that schemata can act 
as regulators for the arrangement of text 
elements in original text generation, as well as 
recall (Paris, Scardamalia, & Bereiter, 1980, in 
Bereiter & Scardamalia, 1981; Stein & Glenn,
1979; Waters, 1980). Meehan (1981) has 
found schemata to be necessary knowledge 
components of his story-generating computer 
program, and Black, Wilkes-Gibbs, and Gibbs 
(1982) describe how schemata (and deviations 
from them) can help a writer determine an ap- 
propriate level of detail to focus interest, create 
drama, and hold interest. 

Certainly a key role for schemata during 
writing is the activation of relevant schematic 
content. Schemata may be very much in- 
volved, therefore, when writing becomes a 
process of discovery of ideas rather than mere 
transcription, and there may be something to 
that common observation, "I don't know what
I think until I write it down." Schemata may
aid in memory search, since they contain 
pointers to yet unaccessed information in the 
writer's memory and thus facilitate retrieval of 
topic relevant information. 



¹The bottom-up approach to reading has often been
characterized as primarily emphasizing decoding and other
word-level processes, and much of the work on individual
differences in reading ability has indeed focused on the
importance of such low-level processes (e.g., Hunt,
Lunneborg, & Lewis, 1975; Perfetti & Lesgold,
This, however, has led to a misconception: that in the
bottom-up approach, decoding is all there is to comprehension.
On the contrary, these lower points in the verbal
processing chain are emphasized only as potential processing
weaknesses that can, if not automated, drain
cognitive resources away from the text-level, integrative
processes that are critical for comprehension (Perfetti, in
press; Perfetti & Lesgold, 1977).
Once relevant semantic content is activated,
there still remains much for the writer to do in
terms of translating clusters of semantic
knowledge into actual text. An attractive
feature of schemata is that they deal with
semantic information at an abstract level.
However, the processes of translating semantic
concepts and relations into grammatical
natural language sentences are not well
specified in the schema-oriented work. Semantic
relations are well expressed by propositions
(relations between predicates and nouns),
but even when relations among semantic concepts
have been organized into lists of propositions,
there is still no natural language text.
The utility of a propositional representation is
that it can be mapped into a variety of
linguistic expressions, all paraphrases of each 
other (see Kintsch, 1974; Kintsch & van Dijk, 
1978). Generating those linguistically 
specified alternatives and choosing the one 
most appropriate in a given linguistic context 
comprise much of the writer's job, and that 
job is often not an easy one. 
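
The point that one proposition maps onto several surface forms can be illustrated with a small sketch. The representation below is a simplification assumed for this example and is not the propositional notation of Kintsch (1974); the class, its field names, and the sample sentence are hypothetical. A single predicate with two arguments is rendered as an active, a passive, and a cleft sentence, each of which gives a different element the sentence focus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """One predicate with two arguments (all field names are illustrative)."""
    past: str        # past-tense form of the predicate, e.g. "planted"
    participle: str  # past participle, used for the passive
    agent: str
    patient: str


def capitalize_first(text):
    return text[0].upper() + text[1:]


def paraphrases(p):
    """Return three surface forms of the same proposition."""
    return {
        "active": f"{capitalize_first(p.agent)} {p.past} {p.patient}.",
        "passive": f"{capitalize_first(p.patient)} was {p.participle} by {p.agent}.",
        "cleft": f"It was {p.agent} who {p.past} {p.patient}.",
    }


forms = paraphrases(
    Proposition("planted", "planted", "the growers", "the cabernet grape")
)
for name, sentence in forms.items():
    print(f"{name:8}{sentence}")
```

Generating the alternatives is the mechanical part; choosing among them in a given linguistic context is the part of the writer's job that the sketch leaves open.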

As our wine columnist found, it is not 
always easy for the writer to choose the linear 
syntactic arrangement that best expresses the 
conceptual relations and still honors local constraints,
especially since the writer is rarely
generating a single sentence in isolation. The 
writer usually tries to generate a connected 
discourse and is thus forced to deal with how
extended texts "work," that is, how the
specific wording of the message places some 
concepts in the foreground, others in the 
background, and integrates them all. Thus, in 
addition to the insights into writing gained 
from schema-oriented, top-down models of 
comprehension, there is much to learn from 
work that focuses on how meaning depends on 
the specific wording of texts and how specific 
wordings can affect processing. 

Bottom-up Processes

Role of linguistic text features in reading. 
The bottom-up approach to reading com- 
prehension has focused on the text itself and 
has emphasized many concepts developed in 
linguistics. For example, linguistic ideas of 
sentence perspective have been discussed at 
length by Halliday (1967; Halliday & Hasan, 
1976). Halliday (1967) distinguished several 
related concepts: information focus, which 
is indicated by tonal groups in speech; 
thematization, realized by order of clause constituents;
identification, realized by special
markings of "identified and identifier" in
cleft and pseudocleft constructions; and
given/new, which is based solely on whether or
not specific information has been previously
presented in the discourse. Many of these ideas
have been incorporated into theories of com- 
prehension (e.g., Just & Carpenter, 1980; 
Perfetti & Lesgold, 1977), and they have been 
the subject of empirical investigation. 

In one such line of research, Clark and 
Haviland (1977) proposed a model of con- 
nected sentence understanding which they call 
the given/new strategy. The listener or reader 
attempts to match the given information in 
each sentence with some information already 
in memory. If that match is successful, the new
information is added to memory. If, however,
the match is unsuccessful, added processing is 
required to make a bridging inference or 
restructure the original given/new assignments
in the sentence. Reading times lend plausibili- 
ty to such a hypothesis. Reading times were 
found to be shorter when syntactically in- 
dicated sentence parsings were appropriate to 
the given/new semantics of the passage.
Similarly, Hornby (1974) found indications 
that cognitive processing was influenced by the 
linguistic presuppositions of the sentence syn- 
tax, and Sanford and Garrod (1981) have pro- 
posed a model of comprehension that deals, at 
the level of specific wording, with such text- 
based processes as inferencing and assigning 
pronominal reference. 
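
The matching step of the given/new strategy described above can be sketched as a short procedure. This is a simplified reading of the strategy, assuming that each sentence has already been divided into one given element and one new element; the function name, the labels "match" and "bridge," and the sample items are hypothetical and are not Clark and Haviland's own formalism.

```python
def given_new(sentences, memory=None):
    """Process (given, new) pairs against memory, one sentence at a time.

    Returns the kind of processing each sentence needed and the final memory.
    """
    memory = set(memory or [])
    trace = []
    for given, new in sentences:
        if given in memory:
            trace.append("match")    # antecedent found: attach the new information
        else:
            trace.append("bridge")   # no antecedent: extra inferential work
            memory.add(given)
        memory.add(new)
    return trace, memory


# Hypothetical discourse about the wine column.
trace, memory = given_new(
    [
        ("Latour", "vineyards"),
        ("vineyards", "cabernet"),
        ("longevity", "backbone"),
    ],
    memory={"Latour"},
)
print(trace)
```

The third sentence presents as given something the reader has not yet met, so it is the one that requires added processing.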

Chafe (1972) discussed the related concept 
of foregrounding, which entails the linguistic 
"staging" of certain lexical items and allows 
their being treated as given in the following 
utterance. Translated into cognitive processing 
terms, foregrounding helps to mark some lex- 
ical items for inclusion in STM while others are 
backgrounded. Thus, reading times should
decrease for foregrounded information, and
this was found to be the case (Lesgold et al.,
1979; Sanford & Garrod, 1981).

Linguistic text features: implications for
writing. The implications of this work for 
writing are two-fold. First, the writer should 
want to create texts that most effectively communicate
ideas to the reader. Thus those
linguistic features of text that affect a reader's 
processing should be important to the writer as 
well. It may not be the case, for stylistic or 
other reasons, that the writer consistently 
makes the reader's job as easy as possible.
However, as the writer searches for various syntactic
constructions, he or she should be continually
aware, at some level, of the subtle
changes in thematization or information focus
that are cued by alternative syntactic constructions.
A writer's control over local text features
can thus affect the quality of the written product,
and analyses of local text coherence have
shown that such local control seems to contribute
to developmental differences observed in
writing by children (McCutchen & Perfetti,
1982) and to perceived quality differences
in writing by college students (Witte &
Faigley, 1981).

The second implication of these linguistic 
text features concerns the processing of the 
writer rather than the reader. The writer, after 
all, becomes a reader during the repeated 
cycles of generating, translating, and reviewing 
that comprise the process of writing (Hayes & 
Flower, 1980), and thus the writer can be 
affected by many of the same text features. 
Perfetti and Goldman (1975) observed that 
readers' preferences for syntactic alternatives 
were indeed influenced by the syntactic 
thematization of sentences that preceded 
them. We also saw evidence of this interaction 
between the developing text and the writer's 
more general semantic plans in the writing of 
our wine columnist (see Figure 2). 

The writer's realization of how a text is 
working linguistically can be very useful. With 
this information, the writer can understand 
the syntactic reasons why a text seems to be 
"going nowhere" or even going somewhere 
the writer does not intend. Understanding the 
syntactic reasons for the problems, the writer 
may then know better how to solve them. 

The ability to ultimately solve such writing 
problems may critically depend on the writer's 
fluency in the processes of linear sentence 
generation: encoding concepts into actual lex- 
ical items, formulating clause-level syntactic 
arrangements, and then morphologically 
manipulating the lexical items to fit the syn- 
tactic frames. It is fluency in linear sentence 
production that aids manipulation of 
sentences and thus ideas. Just as the imposi- 
tion of high level schemata may organize infor- 
mation in interesting and sometimes unex- 
pected ways, lexical and syntactic manipula- 
tions at the local text level may also result in 
fresh juxtapositions of concepts that the writer 
can then evaluate for style, clarity, direction, 
or even plausibility. 



Only with reasonable fluency and cognitive 
efficiency in processes at the local text level, 
however, can the writer afford to play such ex- 
perimental linguistic games with the text. 
"Writing as discovery" is simply too cognitively 
expensive for the writer with limited fluency 
in linear sentence processing. Young writers 
might be at a special disadvantage here not only 
because of their limited syntactic fluency but 
also because of their limited syntactic reper- 
tory. Even when children can recognize flaws 
in their writing, they often cannot propose 
alternative constructions that remedy the problems 
(E. J. Bartlett, 1982; Bereiter & 
Scardamalia, 1981). The writer may even, in 
some sense, know alternative word choices or 
syntactic constructions, but when sentence 
production is cognitively inefficient the 
generation of sentences may proceed on a 
"first come, first served" basis, 
regardless of appropriateness within the 
specific linguistic context. This too 
can be problematic, especially for the young 
writer, since studies by Bracewell and 
Scardamalia (reported in Bereiter & Scar- 
damalia, 1981) showed that children have par- 
ticular trouble linguistically recasting sentences 
when alternative linguistic forms of the 
sentences are present. Thus, if text-level 
processes are not well under control, the writer 
may simply not risk local manipulations, and if 
they are attempted, faulty local processing may 
result in the errors so typical of problem writers 
(see Bartholomae, 1980; Daiute, 1981; 
Shaughnessy, 1977). 

It is the very fluency of most writers' linear 
sentence processing, successful or not, that 
may make it difficult to identify its importance 
in the writing process. Our wine columnist 
was extraordinarily verbal about some of 
his text-level decisions, but this was not true in 
much of his protocol, nor in the protocols of 
many other skilled writers. In a study of pauses 
during writing, Flower and Hayes (1981b) found 
that higher level, rhetorical goals correlated 
better with pause-bordered episodes than did 
local, sentence-level decisions. This is not 
surprising. Especially for the adult writers in that 
study (several of whom were classified as expert 
writers), one might expect that sentence-level 
decisions would not account for large proportions 
of pause time, compared with rhetorical 
decisions. These writers may be so fluent with 
local text manipulations that those sorts of 
decisions are very rapid and not as available 
for report. 

TEXT 

1) Roller-skating is fun and exciting. 
2) Because you can skate with alot of people. 
3) There's many different places to skate. 
4) To roller-skate you use tennis shoes with wheels. 

PROTOCOL 

That doesn't sound good at all! . . . I'll have 
to start the sentence with different, um, different 
words because, um, it says "There's 
many different places to skate" and it really 
doesn't fit right there. "There's also many 
different places to skate." 

TEXT (edits) 

3) There's also many different places to skate. 

Figure 3. Fourth grade writer's protocol and text as the writer explicitly coordinates 
adjacent sentences. 



Text level processes in young writers. For 
writers with less skill and less experience, 
however, such sentence-level decisions are not 
so fluent, and those decisions are observable 
parts of the writing process. A close look at a 
spontaneous editing session of a fourth grader 
gives some insights into her writing process 
and into aspects of performance not observable 
in more fluent writers. 

In Figures 3 and 4 we see, on the right, the 
text produced so far in the writing session and, 
on the left, the writer's comments at that 
point. The sentences of the text have been 
numbered here for ease of reference. 

In Figure 3, the writer has reread her first 
four sentences and is dissatisfied. She is strug- 
gling with local coherence problems between 
sentences 2 and 3, and she solves them with a 
strictly local, sentence-level change. She inserts 
the word "also" into sentence 3 to explicitly 
mark the coordination of ideas between 
sentences 2 and 3. As her protocol reveals, 
text-level decisions take much of this writer's 
attention as she tries to generate sentences that 
"fit" with their neighboring sentences. 

In Figure 4, the writer has deleted sentence 
4 altogether, after several attempts to reword it 
and "start it out different." Her comments 
show that she has decided what semantic con- 
tent she would like to include in her next 
sentence, but the local decisions of "how to 
write it" are very difficult for her. She has 
decided to extend her discussion of "places to 
skate" by mentioning streets and hills, but she 
struggles at the level of phrasing and questions 
how explicit she must make the link between 
"places" and "streets and hills." 

For this young writer, whose linear sentence 
processing is not fluent, text-level decisions are 
very prominent aspects of the writing process. 
In addition, young writers are notorious for 
their lack of high level planning (Burtis et al., 
in press; Scardamalia & Bereiter, 1982), and 
thus their text-level decisions might be even 
more difficult, operating without the guidance 
of superordinate plans. For older, more fluent 
writers, such text-based processes may no 
longer be prominent aspects of writing, 
observable in protocols, but they certainly must 
remain important parts of the process of 
sentence generation and thus important parts 
of writing. 



Speech Production Processes 

The writer's job, in some respects, is not 
unlike the speaker's job: The goal of both is to 
generate a linguistic expression. Of course, 
unlike the speaker who produces a transient 
acoustic signal, the writer produces an endur- 
ing written transcript that can be reexamined 
and edited to improve its fit within a given 
context. With the luxury of revision, the writer 
can alter the text so as to most effectively 
communicate with the reader. 

TEXT 

1) Roller-skating is fun and exciting. 
2) Because you can skate with alot of people. 
3) There's also many different places to skate. 

PROTOCOL 

I'm trying to figure out how to write it— how 
to put it down— to fit with, um— See, it'll fit 
more with, um, "many different places to 
skate." It'll fit with that, like, 'cause, um, 
flat hills are different places— well streets are 
different places, hills are different places . . . 
I'm trying to get this sentence— well, it'll 
fit this sentence, but should I write um, 
"Hills are steep, and they're scarey"? Would 
that make sense— to make— with this 
sentence? Or just write "Hills are— Hills and 
streets are different places to skate"? 

Figure 4. Fourth grade writer's protocol and text as the writer works out syntactic frame for chosen 
concepts. 

Like speech, however, the writing of sentences 
requires encoding semantic concepts 
into actual lexical items and arranging them in 
grammatical sequences that best express their 
semantic relations. Research in speech production 
shows that the process is not one of direct 
translation. Linear sentence generation re- 
quires much interplay among semantic, syn- 
tactic, and lexical levels, as evidenced by 
speech error data (Fromkin, 1973, 1980; 
Garrett, 1981; Levelt, 1983). 

Various kinds of speech errors suggest that 
there are multiple stages in the process of 
sentence production. For example, word ex- 
changes, as in sentences (1) and (2) (from 
Garrett, 1981), tend to be between words of 
the same grammatical category, suggesting 
some syntactic framing had occurred prior to 
the point at which lexical items were inserted 
into the frame. 

(1) Older men choose to tend younger 
wives. 

(intended: tend to choose) 

(2) Write a request for tickets at two for the 
box office. 

(intended: tickets for two at the box of- 
fice) 

Other errors, called stranding errors, suggest 
that the bound morphemes marking gram- 
matical function are partly independent of the 
lexical items with which they are paired and 
are perhaps connected more intimately with 
the syntactic frame itself. In these errors, word 
stems exchange places but leave behind, 
"stranded" in the original syntactic position, 
the bound morpheme that serves as the 
grammatical marker. Further, as shown in sentence 
(3) (from Garrett, 1981), very late in the 
production process those stranded morphemes are 
accommodated to their new phonological 
environment. 

(3) It waits to pay. [stranded -s pronounced /s/] 

(intended: pays to wait) [-s pronounced /z/] 

Such errors have prompted theories of 
speech production that involve several rapidly 
executed stages in which general semantic con- 
tent is chosen first, then individual concepts or 
"lemmas." Then clause-level syntactic frames 
are specified, individual lexical items retrieved 
(corresponding to the semantic lemmas), and 
finally some morphological adjustments made 
to fit the words into the specified frame. 
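
The stranding pattern in sentence (3) can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not an implementation of Garrett's model: the `Slot` type, the function names, and the rough spelling-based stand-in for voiceless sounds are all hypothetical. It shows bound morphemes staying with positions in the frame while stems exchange, and a late adjustment of the stranded morpheme to its new stem.

```python
from dataclasses import dataclass

# Assumption: final letters used as a crude stand-in for voiceless final sounds.
VOICELESS_FINALS = {"p", "t", "k", "f"}

@dataclass
class Slot:
    """One position in a hypothetical clause-level syntactic frame."""
    stem: str
    affix: str = ""  # bound morpheme that belongs to the position, not the stem

def exchange_stems(frame, i, j):
    """Swap the stems at positions i and j; affixes stay stranded in place."""
    new = [Slot(s.stem, s.affix) for s in frame]
    new[i].stem, new[j].stem = frame[j].stem, frame[i].stem
    return new

def pronounce_affix(slot):
    """Late accommodation: the -s marker is voiced or voiceless by its stem."""
    if slot.affix != "-s":
        return ""
    return "/s/" if slot.stem[-1] in VOICELESS_FINALS else "/z/"

def spell_out(frame):
    """Join stems and affixes into a written string."""
    return " ".join(s.stem + s.affix.lstrip("-") for s in frame)

intended = [Slot("it"), Slot("pay", "-s"), Slot("to"), Slot("wait")]
error = exchange_stems(intended, 1, 3)

print(spell_out(intended), pronounce_affix(intended[1]))
print(spell_out(error), pronounce_affix(error[1]))
```

Keeping the affix on the slot rather than on the stem is the design choice that mirrors the claim in the text: the marker is tied to the syntactic frame, so it survives the exchange and is only adjusted afterward.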

Protocols from writers such as our wine col- 
umnist also suggest the existence of several 
levels in the sentence production process. In 
the excerpt in Figure 2, the writer was search- 
ing for semantic concepts and words to capture 
them, while in Figure 1, he had the concepts, 
words, and much of the syntax but was work- 
ing on subtle syntactic refinements that avoid- 
ed word repetition. 

Although we do not usually see word 
exchanges and stranding errors occurring in 
writing, speech production models have been 
applied to writing with some interesting results. 
For example, one theory of speech production 
(Bever, Carroll, & Hurtig, 1976) explains 
many syntactic errors, such as sentence (4), in 
terms of overlapping syntactic frames that, 
when combined, result in an ungrammatical 
merging of two potentially grammatical 
sequences. Sequences (4a) and (4b) are both 
grammatical, but when connected via their 
common segment, "I understand," they 
violate syntactic rules. 

(4) I really enjoyed flying in an airplane that 
I understand how it works. 

(4a) I really enjoyed flying in an airplane that 
I understand. 

(4b) I understand how it works. 

Daiute (1981) has applied similar analyses to 
students' written sentence errors and was able 
to account for a large proportion of the 
students' syntactic errors. 
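
The overlapping-frame account of sentence (4) can be sketched in a few lines. This is a toy illustration only, assuming that sequences are plain word strings and that the overlap is an exact word match; the function name `merge_on_overlap` is hypothetical and is not drawn from Bever, Carroll, and Hurtig or from Daiute.

```python
def merge_on_overlap(first, second):
    """Join two word sequences at the longest segment that ends the first
    and begins the second. Returns (merged, shared) or (None, None)."""
    a, b = first.split(), second.split()
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return " ".join(a + b[k:]), " ".join(b[:k])
    return None, None

seq_4a = "I really enjoyed flying in an airplane that I understand"
seq_4b = "I understand how it works"

merged, shared = merge_on_overlap(seq_4a, seq_4b)
print(shared)
print(merged)
```

Each input is grammatical on its own; the merged string reproduces sentence (4), which is ungrammatical only because the shared segment is asked to close one frame and open another at the same time.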

Conclusions 

The argument presented here has some 
precedents. (See Bracewell, 1980, for a related 
discussion.) In fact, a similar point has been 
made by Flower and Hayes (1981b), major 
proponents of the problem solving approach to 
writing whose recent work focuses primarily on 
the role of planning in writing skill: 

an important part of being a skilled writer is knowing 
not only how to do this rhetorical planning, but 
how to embed sentence-level planning within it — 
how to turn intentions and knowledge into text. 

So much empirical attention, however, has 
been focused on the planning component of 
their problem solving model that the translating 
component often seems trivial in comparison. 
Studies in speech production and reading 
comprehension remind us that generating 
language, even with the help of appropriate 
plans, is a nontrivial task and that linguistic 
features of the text affect processing in impor- 
tant ways. Thus the translating of plans and 
goals into text is an important part of the 
writing process, and the interaction between 
higher-level plans and linguistic features of the 
developing text is a worthy research focus. 

A focus on the linguistic nature of the 
writing process will prompt research to address 
questions that are different from those posed 
in a planning-oriented view. For instance, does 
sentence generation during writing follow the 
course hypothesized in speech production 
studies? It may be the case that the slowed 
pace of writing and the reflection it permits, 
combined with the written transcript it leaves 
behind, alter the process. In the less transient 
environment of written text, sentences occurring 
earlier in the discourse may affect on-line 
productions in ways that spoken sentences can- 
not. On the other hand, writing might not 
substantially change the process; it might 
simply make it easier to track and thus help 
refine theories of sentence production. 

When the focus is on how semantic content 
is translated into language, issues also arise as 
to how (and how well) coherence among the 
semantic concepts is represented through 
coherence in the text itself. Such issues include 
how linguistic devices maintain textual 
coherence and how intimately such 
linguistic devices are tied to the semantics of the 
content. Since skilled writers seem to be able 
to transfer at least part of their skill across 
knowledge domains, one might be tempted to 
hypothesize that some aspects of coherent 
writing are independent of content. This 
implies that, in an effort to create a coherent text, 
the good writer somehow recognizes areas of 
ignorance and (a) either avoids or "writes 
around" them, or (b) clears them up in the 
process of writing. The second alternative is 
clearly the more interesting. Many writers have 
had the experience of crystallizing ideas only 
once they begin to write them down, and 
the role that language generation itself 
plays in this process is a most 
intriguing question. 

Also interesting is the development of the 
ability to view language as separate from the 
content it expresses. How and when does the 
writer, or the language user in general, begin 
to represent language as opaque, as something 
that can be crafted to better express given 
semantic concepts rather than just a 
transparent window on those concepts? For the 
novice language user, the emphasis is usually 
on the message, but the writer must focus on 
the linguistic expression of that message as 
well. Understanding how "what is said" dif- 
fers from "what is meant" is a critical part 
of writing. 

Thus, focusing on writing both as a text- 
driven linguistic task and as a planning 
task, writing researchers may begin to get a 
more comprehensive understanding of the 
writing process. 